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Abstract 

We present a formalism for light optics starting with the Maxwell equations
and casting them into an exact matrix form, taking into account the spatial
and temporal variations of the permittivity and permeability. This 8 × 8
matrix representation is used to construct the optical Hamiltonian. This has
a close analogy with the algebraic structure of the Dirac equation, enabling
the use of the rich machinery of the Dirac electron theory. We get interesting
wavelength-dependent contributions which cannot be obtained in any of the
traditional approaches.

1 Introduction

The traditional scalar wave theory of optics (including aberrations to all
orders) is based on the beam-optical Hamiltonian derived using Fermat's
principle. This approach is purely geometrical and works adequately in the
scalar regime. The other approach is based on the Helmholtz equation, which
is derived from the Maxwell equations. Here one takes the square root of
the Helmholtz operator and then expands the radical [1], [2]. This
approach works to all orders, and the resulting expansion is no different from
the one obtained using the geometrical approach of Fermat's principle.



Another way of obtaining the aberration expansion is based on the
algebraic similarities between the Helmholtz equation and the Klein-Gordon
equation. Exploiting this algebraic similarity, the Helmholtz equation is
linearized in a procedure very similar to the one due to Feshbach and Villars
for linearizing the Klein-Gordon equation. This brings the Helmholtz equation
to a Dirac-like form, to which one then applies the Foldy-Wouthuysen
expansion used in the Dirac electron theory. This approach, which uses the
algebraic machinery of quantum mechanics, was developed recently [3],
providing an alternative to the traditional square-root procedure. This scalar
formalism gives rise to wavelength-dependent contributions modifying the
aberration coefficients [4]. The algebraic machinery of this formalism is very
similar to the one used in the quantum theory of charged-particle beam optics,
based on the Dirac [5] and the Klein-Gordon [6] equations respectively. A
detailed account of both is available in [7]. A treatment of beam
optics taking into account the anomalous magnetic moment is available in [8].

As for the polarization: a systematic procedure for the passage from
scalar to vector wave optics to handle paraxial beam propagation problems,
completely taking into account the way in which the Maxwell equations
couple the spatial variation and polarization of light waves, has been
formulated by analysing the basic Poincaré invariance of the system, and this
procedure has been successfully used to clarify several issues in Maxwell
optics [9]-[11].



In all the above approaches, the beam-optics and the polarization are 
studied separately, using very different machineries. The derivation of the 
Helmholtz equation from the Maxwell equations is an approximation as one 
neglects the spatial and temporal derivatives of the permittivity and perme- 
ability of the medium. Any prescription based on the Helmholtz equation is 
bound to be an approximation, irrespective of how good it may be in cer- 
tain situations. It is very natural to look for a prescription based fully on 
the Maxwell equations. Such a prescription is sure to provide a deeper un- 
derstanding of beam-optics and polarization in a unified manner. With this 
as the chief motivation we construct a formalism starting with the Maxwell 
equations in a matrix form: a single entity containing all the four Maxwell 
equations. 

In our approach we require an exact matrix representation of the Maxwell
equations in a medium, taking into account the spatial and temporal
variations of the permittivity and permeability. It is necessary and sufficient to
use 8 × 8 matrices for such an exact representation. The derivation of the
required matrix representation, and how it differs from the numerous other
ones, is presented in Part-I [12].

In the present Part (Part-II) we proceed with the exact matrix
representation of the Maxwell equations derived in Part-I, and construct a general
formalism. The derived representation has a very close algebraic
correspondence with the Dirac equation. This enables us to apply the machinery of the
Foldy-Wouthuysen expansion used in the Dirac electron theory. The
Foldy-Wouthuysen transformation technique is outlined in Appendix-A. General
expressions for the Hamiltonians are derived without assuming any specific
form for the refractive index. These Hamiltonians are shown to contain the
extra wavelength-dependent contributions which arise very naturally in our
approach. In Part-III [13] we apply the general formalism to the specific
examples: A. Medium with Constant Refractive Index. This example is
essentially for illustrating some of the details of the machinery used.

The other application, B. Axially Symmetric Graded Index Medium, is
used to demonstrate the power of the formalism. Two points are worth
mentioning. The first is image rotation: our formalism gives rise to an image
rotation (proportional to the wavelength), and we have derived an explicit
relationship for the angle of the image rotation. The other pertains to the
aberrations: in our formalism we get all the nine aberrations permitted by the
axial symmetry. The traditional approaches give six aberrations. Our formalism
modifies these six aberration coefficients by wavelength-dependent
contributions and also gives rise to the remaining three permitted by the axial
symmetry. The existence of the nine aberrations and image rotation is well
known in axially symmetric magnetic lenses, even when treated classically. The
quantum treatment of the same system leads to the wavelength-dependent
modifications [7]. The alternate procedure for the Helmholtz optics in [3], [4]
gives the usual six aberrations (though modified by the wavelength-dependent
contributions) and does not give any image rotation. These extra aberrations
and the image rotation are the exclusive outcome of the fact that the formalism
is based on the Maxwell equations, and done exactly.

The traditional beam-optics is completely obtained from our approach in
the limit of vanishing wavelength, $\lambdabar \to 0$, which we call the
traditional limit of our formalism. This is analogous to the classical limit
obtained by taking $\hbar \to 0$ in the quantum prescriptions. The scheme of
using the Foldy-Wouthuysen machinery in this formalism is very similar to the
one used in the quantum theory of charged-particle beam optics [5], [6], [7].
There too one recovers the classical prescriptions in the limit
$\lambdabar_0 \to 0$, where $\lambdabar_0 = \hbar/p_0$ is the de Broglie
wavelength and $p_0$ is the design momentum of the system under study.

The studies on the polarization are in progress. Some of the results in [11]
have been obtained as the lowest order approximation of the more general
framework presented here. These will be presented in Part-IV soon [14].



2 An exact matrix representation of the Maxwell equations in a medium



Matrix representations of the Maxwell equations are very well known [15]-[17].
However, all these representations lack exactness and/or are given
in terms of a pair of matrix equations. A treatment expressing the Maxwell
equations in a single matrix equation instead of a pair of matrix equations
was obtained recently [12]. This representation contains all the four Maxwell
equations in the presence of sources, taking into account the spatial and
temporal variations of the permittivity $\epsilon(\mathbf{r},t)$ and the
permeability $\mu(\mathbf{r},t)$.

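Maxwell equations [17], [18] in an inhomogeneous medium with sources are

$$
\begin{aligned}
\nabla\cdot\mathbf{D}(\mathbf{r},t) &= \rho, \\
\nabla\times\mathbf{H}(\mathbf{r},t) - \frac{\partial}{\partial t}\mathbf{D}(\mathbf{r},t) &= \mathbf{J}, \\
\nabla\times\mathbf{E}(\mathbf{r},t) + \frac{\partial}{\partial t}\mathbf{B}(\mathbf{r},t) &= 0, \\
\nabla\cdot\mathbf{B}(\mathbf{r},t) &= 0.
\end{aligned}
\tag{1}
$$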



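We assume the media to be linear, that is
$\mathbf{D} = \epsilon(\mathbf{r},t)\mathbf{E}$ and
$\mathbf{B} = \mu(\mathbf{r},t)\mathbf{H}$, where $\epsilon$ is the
permittivity of the medium and $\mu$ is the permeability of the medium. The
magnitude of the velocity of light in the medium is given by
$v(\mathbf{r},t) = |\mathbf{v}(\mathbf{r},t)| = 1/\sqrt{\epsilon(\mathbf{r},t)\mu(\mathbf{r},t)}$.
In vacuum we have $\epsilon_0 = 8.85\times 10^{-12}\ \mathrm{C^2/N\,m^2}$ and
$\mu_0 = 4\pi\times 10^{-7}\ \mathrm{N/A^2}$. Following the notation
in [16], [12] we use the Riemann-Silberstein vector given by

$$
\mathbf{F}^{\pm}(\mathbf{r},t) = \frac{1}{\sqrt{2}}\left(\sqrt{\epsilon(\mathbf{r},t)}\,\mathbf{E}(\mathbf{r},t) \pm i\,\frac{1}{\sqrt{\mu(\mathbf{r},t)}}\,\mathbf{B}(\mathbf{r},t)\right).
\tag{2}
$$

We further define,

$$
\Psi^{\pm}(\mathbf{r},t) =
\begin{bmatrix} -F_x^{\pm} \pm iF_y^{\pm} \\ F_z^{\pm} \\ F_z^{\pm} \\ F_x^{\pm} \pm iF_y^{\pm} \end{bmatrix},
\qquad
W^{\pm} = \left(\frac{1}{\sqrt{2\epsilon}}\right)
\begin{bmatrix} -J_x \pm iJ_y \\ J_z - v\rho \\ J_z + v\rho \\ J_x \pm iJ_y \end{bmatrix},
\tag{3}
$$

where $W^{\pm}$ are the vectors for the sources. Following the notation in [12] the
exact matrix representation of the Maxwell equations is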



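$$
\begin{aligned}
&\frac{\partial}{\partial t}
\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix}
- \frac{\dot{v}(\mathbf{r},t)}{2v(\mathbf{r},t)}
\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix}
+ \frac{\dot{h}(\mathbf{r},t)}{2h(\mathbf{r},t)}
\begin{bmatrix} 0 & i\beta\alpha_y \\ i\beta\alpha_y & 0 \end{bmatrix}
\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix} \\
&\quad = -v(\mathbf{r},t)
\begin{bmatrix}
\{\mathbf{M}\cdot\nabla + \boldsymbol{\Sigma}\cdot\mathbf{u}\} & -i\beta(\boldsymbol{\Sigma}\cdot\mathbf{w})\alpha_y \\
-i\beta(\boldsymbol{\Sigma}^{*}\cdot\mathbf{w})\alpha_y & \{\mathbf{M}^{*}\cdot\nabla + \boldsymbol{\Sigma}^{*}\cdot\mathbf{u}\}
\end{bmatrix}
\begin{bmatrix} \Psi^{+} \\ \Psi^{-} \end{bmatrix}
- \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} W^{+} \\ W^{-} \end{bmatrix},
\end{aligned}
\tag{4}
$$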



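where '*' denotes complex conjugation, $\dot{v} = \partial v/\partial t$ and
$\dot{h} = \partial h/\partial t$. The various matrices are

$$
\begin{aligned}
M_x &= \begin{bmatrix} 0 & \mathbf{1} \\ \mathbf{1} & 0 \end{bmatrix}, &
M_y &= \begin{bmatrix} 0 & -i\mathbf{1} \\ i\mathbf{1} & 0 \end{bmatrix}, &
M_z &= \beta = \begin{bmatrix} \mathbf{1} & 0 \\ 0 & -\mathbf{1} \end{bmatrix}, \\
\boldsymbol{\Sigma} &= \begin{bmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{bmatrix}, &
\boldsymbol{\alpha} &= \begin{bmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{bmatrix}, &
I &= \begin{bmatrix} \mathbf{1} & 0 \\ 0 & \mathbf{1} \end{bmatrix},
\end{aligned}
\tag{5}
$$

and $\mathbf{1}$ is the 2 × 2 unit matrix. The triplet of the Pauli matrices, $\boldsymbol{\sigma}$, is

$$
\boldsymbol{\sigma} = \left[
\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\;
\sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix},\;
\sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\right],
\tag{6}
$$

and

$$
\begin{aligned}
\mathbf{u}(\mathbf{r},t) &= \frac{1}{2v(\mathbf{r},t)}\nabla v(\mathbf{r},t) = \frac{1}{2}\nabla\{\ln v(\mathbf{r},t)\} = -\frac{1}{2}\nabla\{\ln n(\mathbf{r},t)\}, \\
\mathbf{w}(\mathbf{r},t) &= \frac{1}{2h(\mathbf{r},t)}\nabla h(\mathbf{r},t) = \frac{1}{2}\nabla\{\ln h(\mathbf{r},t)\}.
\end{aligned}
\tag{7}
$$

Lastly,

$$
\begin{aligned}
\text{Velocity Function}:\quad v(\mathbf{r},t) &= \frac{1}{\sqrt{\epsilon(\mathbf{r},t)\mu(\mathbf{r},t)}}, \\
\text{Resistance Function}:\quad h(\mathbf{r},t) &= \sqrt{\frac{\mu(\mathbf{r},t)}{\epsilon(\mathbf{r},t)}}.
\end{aligned}
\tag{8}
$$

As we shall see soon, it is advantageous to use the above derived functions
instead of the permittivity, $\epsilon(\mathbf{r},t)$, and the permeability,
$\mu(\mathbf{r},t)$. The functions $v(\mathbf{r},t)$ and $h(\mathbf{r},t)$
have the dimensions of velocity and resistance respectively.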
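As a quick numerical illustration of the two derived functions in (8), the following minimal sketch evaluates them for vacuum. The function names are ours, chosen only for this example, and the vacuum permittivity is quoted to more digits than in the text above.

```python
import math

def velocity_function(eps: float, mu: float) -> float:
    """v = 1 / sqrt(eps * mu); metres per second for SI inputs."""
    return 1.0 / math.sqrt(eps * mu)

def resistance_function(eps: float, mu: float) -> float:
    """h = sqrt(mu / eps); ohms for SI inputs."""
    return math.sqrt(mu / eps)

# Vacuum values in SI units (illustrative inputs for this sketch).
EPS0 = 8.8541878128e-12   # C^2 / (N m^2)
MU0 = 4.0e-7 * math.pi    # N / A^2

v_vacuum = velocity_function(EPS0, MU0)
h_vacuum = resistance_function(EPS0, MU0)
print(f"v = {v_vacuum:.4e} m/s, h = {h_vacuum:.2f} ohm")
```

In vacuum the velocity function reduces to the speed of light and the resistance function to the impedance of free space, which is the sense in which the two functions carry the dimensions of velocity and resistance.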

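Let us consider the case without any sources ($W^{\pm} = 0$). We further
assume,

$$
\Psi^{\pm}(\mathbf{r},t) = \psi^{\pm}(\mathbf{r})\,e^{-i\omega t}, \qquad \omega > 0,
\tag{9}
$$

with $\dot{v}(\mathbf{r},t) = 0$ and $\dot{h}(\mathbf{r},t) = 0$. Then,

$$
\begin{bmatrix} M_z & 0 \\ 0 & M_z \end{bmatrix}
\frac{\partial}{\partial z}
\begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}
= i\frac{\omega}{v(\mathbf{r})}
\begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}
- \begin{bmatrix}
\{\mathbf{M}_{\perp}\cdot\nabla_{\perp} + \boldsymbol{\Sigma}\cdot\mathbf{u}\} & -i\beta(\boldsymbol{\Sigma}\cdot\mathbf{w})\alpha_y \\
-i\beta(\boldsymbol{\Sigma}^{*}\cdot\mathbf{w})\alpha_y & \{\mathbf{M}_{\perp}^{*}\cdot\nabla_{\perp} + \boldsymbol{\Sigma}^{*}\cdot\mathbf{u}\}
\end{bmatrix}
\begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}.
\tag{10}
$$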



At this stage we introduce the process of wavization, through the familiar
Schrödinger replacement

$$
-i\lambdabar\nabla_{\perp} \longrightarrow \hat{\mathbf{p}}_{\perp}, \qquad
-i\lambdabar\frac{\partial}{\partial z} \longrightarrow p_z,
\tag{11}
$$

where $\lambdabar = \lambda/2\pi$ is the reduced wavelength, $c = \lambdabar\omega$ and
$n(\mathbf{r}) = c/v(\mathbf{r})$ is the refractive index of the medium. Note
that $(\hat{p}q - q\hat{p}) = -i\lambdabar$, which is very similar to the
commutation relation $(\hat{p}q - q\hat{p}) = -i\hbar$ in quantum
mechanics. In our formalism, $\lambdabar$ plays the same role which is played by the
Planck constant $\hbar$ in quantum mechanics. The traditional beam-optics is
completely obtained from our formalism in the limit $\lambdabar \to 0$.
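The commutation relation quoted above can be checked on a concrete test function. The sketch below is only an illustration: it represents a polynomial by its list of coefficients, lets the momentum act as minus i times the reduced wavelength times d/dx, and lets the coordinate act as multiplication by x. The helper names and the numerical value of the reduced wavelength are assumptions made for this example.

```python
LAMBDA_BAR = 0.1  # illustrative reduced wavelength, arbitrary units

def q(coeffs):
    """Multiply the polynomial sum(c_k x^k) by x."""
    return [0j] + list(coeffs)

def p(coeffs):
    """Apply p = -i * LAMBDA_BAR * d/dx to the polynomial."""
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return [-1j * LAMBDA_BAR * c for c in derivative]

def subtract(a, b):
    """Coefficient-wise a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

f = [1 + 0j, 2 + 0j, 0j, 4 + 0j]            # f(x) = 1 + 2x + 4x^3
commutator_f = subtract(p(q(f)), q(p(f)))   # (pq - qp) acting on f
expected = [-1j * LAMBDA_BAR * c for c in f]
print(commutator_f == expected or commutator_f)
```

Because the derivative of x f(x) is f(x) + x f'(x), the difference (pq - qp) f leaves exactly the original polynomial multiplied by minus i times the reduced wavelength, whatever polynomial is chosen.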

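Noting that $M_z^{-1} = M_z = \beta$, we multiply both sides of equation (10) by

$$
\begin{bmatrix} M_z & 0 \\ 0 & M_z \end{bmatrix}^{-1}
= \begin{bmatrix} \beta & 0 \\ 0 & \beta \end{bmatrix}
\tag{12}
$$

and by $(i\lambdabar)$; then we obtain

$$
i\lambdabar\frac{\partial}{\partial z}
\begin{bmatrix} \psi^{+}(\mathbf{r}) \\ \psi^{-}(\mathbf{r}) \end{bmatrix}
= \hat{\mathrm{H}}_g
\begin{bmatrix} \psi^{+}(\mathbf{r}) \\ \psi^{-}(\mathbf{r}) \end{bmatrix}.
\tag{13}
$$

This is the basic optical equation, where

$$
\begin{aligned}
\hat{\mathrm{H}}_g &= -n_0 \begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g + \hat{\mathcal{E}}_g + \hat{\mathcal{O}}_g, \\
\hat{\mathcal{E}}_g &= -\left(n(\mathbf{r}) - n_0\right)\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g
+ \begin{bmatrix}
\beta\left\{\mathbf{M}_{\perp}\cdot\hat{\mathbf{p}}_{\perp} - i\lambdabar\boldsymbol{\Sigma}\cdot\mathbf{u}\right\} & 0 \\
0 & \beta\left\{\mathbf{M}_{\perp}^{*}\cdot\hat{\mathbf{p}}_{\perp} - i\lambdabar\boldsymbol{\Sigma}^{*}\cdot\mathbf{u}\right\}
\end{bmatrix}, \\
\hat{\mathcal{O}}_g &= \begin{bmatrix}
0 & -\lambdabar(\boldsymbol{\Sigma}\cdot\mathbf{w})\alpha_y \\
-\lambdabar(\boldsymbol{\Sigma}^{*}\cdot\mathbf{w})\alpha_y & 0
\end{bmatrix},
\end{aligned}
\tag{14}
$$

where 'g' stands for grand, signifying the eight dimensions, and

$$
\beta_g = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}.
\tag{15}
$$

The above optical Hamiltonian is exact (as exact as the Maxwell equations
in a time-independent linear medium). The approximations are made only at
the time of doing specific calculations. Apart from the exactness, the optical
Hamiltonian is in complete algebraic analogy with the Dirac equation with
appropriate physical interpretations. The relevant point is:

$$
\beta_g\hat{\mathcal{O}}_g = -\hat{\mathcal{O}}_g\beta_g.
\tag{16}
$$

We note that the upper component ($\psi^{+}$) is coupled to the lower component
($\psi^{-}$) through the logarithmic gradient of the resistance function. If this
coupling function $\mathbf{w}$ is zero, or is approximated to be zero, then the
equations for ($\psi^{+}$) and ($\psi^{-}$) get completely decoupled, leading to
two independent equations. Each of these two equations is equivalent to the
other. These are the leading equations for our studies of beam-optics and
polarization. In the optics context any contribution from the gradient of the
resistance function can be assumed to be negligible. With this reasonable
assumption we can decouple the equations and reduce the problem from eight
dimensions to four dimensions. In the following sections we shall present a
formalism with the approximation $\mathbf{w} \approx 0$. After constructing the
formalism in four dimensions we shall also address the question of dealing with
the contributions coming from the gradient of the resistance function. This
will require the application of the Foldy-Wouthuysen transformation technique
in cascade, as we shall see. This justifies the usage of the two derived
laboratory functions in place of permittivity and permeability respectively.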

3 The Beam-Optical Formalism 

In the previous section, starting with the Maxwell equations we presented
the exact representation of the Maxwell equations using 8 × 8 matrices. From
this representation we constructed the optical Hamiltonian having 8 × 8
matrices. The coupling of the upper and lower components of the corresponding
eight-vector was neatly expressed through the logarithmic gradient of the
laboratory function, the resistance. We reason that in the optical context
we can safely ignore this term and reduce the problem from eight to four
dimensions without any significant loss of physical content.

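We drop the '+' throughout, and then the beam-optical Hamiltonian is

$$
\begin{aligned}
i\lambdabar\frac{\partial}{\partial z}\psi(\mathbf{r}) &= \hat{\mathrm{H}}\psi(\mathbf{r}), \\
\hat{\mathrm{H}} &= -n_0\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}, \\
\hat{\mathcal{E}} &= -\left(n(\mathbf{r}) - n_0\right)\beta - i\lambdabar\beta\boldsymbol{\Sigma}\cdot\mathbf{u}, \\
\hat{\mathcal{O}} &= i\left(M_y\hat{p}_x - M_x\hat{p}_y\right) = \beta\left(\mathbf{M}_{\perp}\cdot\hat{\mathbf{p}}_{\perp}\right).
\end{aligned}
\tag{17}
$$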

If we were to neglect the derivatives of the permittivity and permeability, we
would have missed the term $(-i\lambdabar\beta\boldsymbol{\Sigma}\cdot\mathbf{u})$.
This is an outcome of the exact treatment.
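The two equivalent forms of the odd term in (17) rest only on the algebra of the block matrices in (5). The following sketch checks them numerically; it is an illustration under the simplifying assumption that the transverse momenta are replaced by ordinary numbers, and the helper names are ours.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    """Return the linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    """True when two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# 4x4 matrices of equation (5), written out with the 2x2 unit blocks expanded.
M_X = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
M_Y = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

px, py = 0.03, -0.02  # sample transverse momenta treated as plain numbers

odd_form_1 = combine(1j * px, M_Y, -1j * py, M_X)       # i (M_y p_x - M_x p_y)
odd_form_2 = matmul(BETA, combine(px, M_X, py, M_Y))    # beta (M_perp . p_perp)
anticommutator = combine(1, matmul(BETA, odd_form_1), 1, matmul(odd_form_1, BETA))
odd_squared = matmul(odd_form_1, odd_form_1)

zero = [[0] * 4 for _ in range(4)]
minus_p2 = [[-(px**2 + py**2) if i == j else 0 for j in range(4)] for i in range(4)]
print(close(odd_form_1, odd_form_2), close(anticommutator, zero), close(odd_squared, minus_p2))
```

The three checks correspond to the equality of the two forms of the odd term, its anticommutation with the matrix beta, and the fact that its square is minus the squared transverse momentum times the unit matrix.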

Proceeding with our analogy with the Dirac equation: this extra term
is analogous to the anomalous magnetic/electric moment term coupled to
the magnetic/electric field respectively in the Dirac equation. The term we
dropped (while going from the exact to the almost-exact) is analogous to the
anomalous magnetic/electric moment term coupled to the electric/magnetic
fields respectively. However, it should be borne in mind that in our exact
treatment both the terms were derived from the Maxwell equations, whereas
in the Dirac theory the anomalous terms are added based on experimental
results and certain arguments of invariance. Besides, these are the only two
terms one gets. The term $(-i\lambdabar\beta\boldsymbol{\Sigma}\cdot\mathbf{u})$
is related to the polarization and we shall call it the polarization term.

One of the other similarities worth noting relates to the square of the
optical Hamiltonian:

$$
\begin{aligned}
\hat{\mathrm{H}}^2 &= \left\{n^2(\mathbf{r}) - \hat{p}_{\perp}^2\right\} - \lambdabar^2 u^2 + \left[\mathbf{M}_{\perp}\cdot\hat{\mathbf{p}}_{\perp}, n(\mathbf{r})\right]
+ 2i\lambdabar n(\mathbf{r})\boldsymbol{\Sigma}\cdot\mathbf{u} + i\lambdabar\left[\mathbf{M}_{\perp}\cdot\hat{\mathbf{p}}_{\perp}, \boldsymbol{\Sigma}\cdot\mathbf{u}\right] \\
&= \left\{n(\mathbf{r}) + i\lambdabar\boldsymbol{\Sigma}\cdot\mathbf{u}\right\}^2 - \hat{p}_{\perp}^2
+ \left[\mathbf{M}_{\perp}\cdot\hat{\mathbf{p}}_{\perp}, \left\{n(\mathbf{r}) + i\lambdabar\boldsymbol{\Sigma}\cdot\mathbf{u}\right\}\right].
\end{aligned}
\tag{18}
$$



It is to be noted that the square of the Hamiltonian in our formalism differs
from the square of the Hamiltonian in the square-root approaches [1], [2] and
the scalar approach in [3], [4]. This is essentially the same type of difference
which exists in the Dirac case. There too, the square of the Dirac Hamiltonian
gives rise to extra pieces (such as $-\hbar q\boldsymbol{\Sigma}\cdot\mathbf{B}$,
the Pauli term which couples the spin to the magnetic field) which are absent in
the Schrödinger and the Klein-Gordon descriptions. It is this difference in the
square of the Hamiltonians which gives rise to the various extra
wavelength-dependent contributions in our formalism. These differences persist
even in the approximation when the polarization term is neglected.
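For orientation, the simplest special case of the squared Hamiltonian can be checked directly. The sketch below assumes a uniform refractive index (so that the gradient terms vanish) and replaces the transverse momenta by ordinary numbers; under these assumptions the square of the matrix Hamiltonian should collapse to the scalar quantity that sits under the radical of the traditional approach. The helper names and sample values are ours.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    """Return the linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    """True when two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

M_X = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
M_Y = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

n0 = 1.5               # uniform refractive index (illustrative value)
px, py = 0.03, -0.02   # sample transverse momenta treated as plain numbers

odd = matmul(BETA, combine(px, M_X, py, M_Y))   # beta (M_perp . p_perp)
hamiltonian = combine(-n0, BETA, 1, odd)        # H = -n0 beta + O
h_squared = matmul(hamiltonian, hamiltonian)

scalar = n0**2 - (px**2 + py**2)
expected = [[scalar if i == j else 0 for j in range(4)] for i in range(4)]
print(close(h_squared, expected))
```

In this special case nothing distinguishes the matrix formalism from the scalar one; the extra pieces discussed above arise only from the commutator and gradient terms, which vanish here by construction.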

Recall that in the traditional scalar wave theory for treating a
monochromatic quasiparaxial light beam propagating along the positive z-axis,
the z-evolution of the optical wave function $\psi(\mathbf{r})$ is taken to obey
the Schrödinger-like equation

$$
i\lambdabar\frac{\partial}{\partial z}\psi(\mathbf{r}) = \hat{H}\psi(\mathbf{r}),
\tag{19}
$$

where the optical Hamiltonian $\hat{H}$ is formally given by the radical

$$
\hat{H} = -\left(n^2(\mathbf{r}) - \hat{p}_{\perp}^2\right)^{1/2},
\tag{20}
$$

and $n(\mathbf{r}) = n(x,y,z)$. In beam optics the rays are assumed to propagate
almost parallel to the optic axis, chosen here to be the z-axis. That is,
$|\mathbf{p}_{\perp}| \ll 1$. The refractive index is of the order of unity. For
a medium with uniform refractive index, $n(\mathbf{r}) = n_0$, the Taylor
expansion of the radical is

$$
\left(n^2(\mathbf{r}) - \hat{p}_{\perp}^2\right)^{1/2}
= n_0\left\{1 - \frac{1}{n_0^2}\hat{p}_{\perp}^2\right\}^{1/2}
= n_0\left\{1 - \frac{1}{2n_0^2}\hat{p}_{\perp}^2 - \frac{1}{8n_0^4}\hat{p}_{\perp}^4 - \frac{1}{16n_0^6}\hat{p}_{\perp}^6 - \cdots\right\}.
\tag{21}
$$

In the above expansion one retains terms to any desired degree of accuracy
in powers of $\left(\frac{1}{n_0^2}\hat{p}_{\perp}^2\right)$. In general the
refractive index is not a constant. The variation of the refractive index
$n(\mathbf{r})$ is expressed as a Taylor expansion in the spatial variables
x, y with z-dependent coefficients. To get the beam-optical Hamiltonian one
makes the expansion of the radical as before, and retains terms to the desired
order of accuracy in $\left(\frac{1}{n_0^2}\hat{p}_{\perp}^2\right)$ along
with all the other terms (coming from the expansion of the refractive index
$n(\mathbf{r})$) in the phase-space components up to the same order. In this
expansion procedure the problem is partitioned into paraxial behaviour +
aberrations, order-by-order.
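The order-by-order character of the expansion can be made concrete with a few numbers. The sketch below compares the exact radical for a uniform medium with its Taylor series truncated after the paraxial term and after the leading aberration term; the function names and the sample values of the index and transverse momentum are assumptions made for this illustration.

```python
import math

def radical(n0: float, p: float) -> float:
    """Exact value of sqrt(n0^2 - p^2)."""
    return math.sqrt(n0**2 - p**2)

def radical_series(n0: float, p: float, order: int) -> float:
    """Taylor series of sqrt(n0^2 - p^2) in x = p^2 / n0^2, kept up to x**order.

    order=1 keeps the paraxial term; order=2 adds the leading aberration term.
    """
    x = (p / n0) ** 2
    coefficients = [1.0, -1.0 / 2.0, -1.0 / 8.0, -1.0 / 16.0]
    return n0 * sum(c * x**k for k, c in enumerate(coefficients[: order + 1]))

n0, p = 1.5, 0.15   # illustrative index and transverse momentum, p/n0 = 0.1
err_paraxial = abs(radical(n0, p) - radical_series(n0, p, 1))
err_aberration = abs(radical(n0, p) - radical_series(n0, p, 2))
print(f"paraxial error = {err_paraxial:.2e}, with p^4 term = {err_aberration:.2e}")
```

Each additional term reduces the truncation error by roughly a factor of the expansion parameter, which is why the partition into paraxial behaviour plus aberrations is useful for quasiparaxial beams.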

In relativistic quantum mechanics too, one has the problem of
understanding the behaviour in terms of nonrelativistic limit + relativistic
corrections, order-by-order. In the Dirac theory of the electron this is done
most conveniently through the Foldy-Wouthuysen transformation [19], [20]. The
Hamiltonian derived in (17) has a very close algebraic resemblance with the
Dirac case, accompanied by the analogous physical interpretations. The
details of the analogy and the Foldy-Wouthuysen transformation are given in
Appendix-A.

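To the leading order, that is to order $\left(\frac{1}{n_0^2}\hat{p}_{\perp}^2\right)$,
the beam-optical Hamiltonian in terms of $\hat{\mathcal{E}}$ and
$\hat{\mathcal{O}}$ is formally given by

$$
\begin{aligned}
i\lambdabar\frac{\partial}{\partial z}|\psi\rangle &= \hat{\mathcal{H}}^{(2)}|\psi\rangle, \\
\hat{\mathcal{H}}^{(2)} &= -n_0\beta + \hat{\mathcal{E}} - \frac{1}{2n_0}\beta\hat{\mathcal{O}}^2.
\end{aligned}
\tag{22}
$$

Note that $\hat{\mathcal{O}}^2 = -\hat{p}_{\perp}^2$ and
$\hat{\mathcal{E}} = -\left(n(\mathbf{r}) - n_0\right)\beta - i\lambdabar\beta\boldsymbol{\Sigma}\cdot\mathbf{u}$.
Since we are primarily interested in the forward propagation, we drop the
$\beta$ from the non-matrix parts of the Hamiltonian. The matrix terms are
related to the polarization. The formal Hamiltonian in (22), expressed in terms
of the phase-space variables, is:

$$
\hat{\mathcal{H}}^{(2)} = -\left\{n(\mathbf{r}) - \frac{1}{2n_0}\hat{p}_{\perp}^2\right\} - i\lambdabar\beta\boldsymbol{\Sigma}\cdot\mathbf{u}.
\tag{23}
$$

Note that one retains terms up to quadratic in the Taylor expansion of the
refractive index $n(\mathbf{r})$ to be consistent with the order of
$\left(\frac{1}{n_0^2}\hat{p}_{\perp}^2\right)$. This is the paraxial
Hamiltonian, which also contains an extra matrix-dependent term that we call
the polarization term. The rest of it is similar to the one obtained in the
traditional approaches.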

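To go beyond the paraxial approximation one goes a step further in the
Foldy-Wouthuysen iterative procedure. Note that $\hat{\mathcal{O}}$ is of the
order of $\hat{\mathbf{p}}_{\perp}$. To order
$\left(\frac{1}{n_0^2}\hat{p}_{\perp}^2\right)^2$, the beam-optical Hamiltonian
in terms of $\hat{\mathcal{E}}$ and $\hat{\mathcal{O}}$ is formally given by

$$
\begin{aligned}
i\lambdabar\frac{\partial}{\partial z}|\psi\rangle &= \hat{\mathcal{H}}^{(4)}|\psi\rangle, \\
\hat{\mathcal{H}}^{(4)} &= -n_0\beta + \hat{\mathcal{E}} - \frac{1}{2n_0}\beta\hat{\mathcal{O}}^2
- \frac{1}{8n_0^2}\left[\hat{\mathcal{O}}, \left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\lambdabar\frac{\partial}{\partial z}\hat{\mathcal{O}}\right)\right] \\
&\quad + \frac{1}{8n_0^3}\beta\left\{\hat{\mathcal{O}}^4 + \left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\lambdabar\frac{\partial}{\partial z}\hat{\mathcal{O}}\right)^2\right\}.
\end{aligned}
\tag{24}
$$

Note that $\hat{\mathcal{O}}^4 = \hat{p}_{\perp}^4$ and
$\frac{\partial}{\partial z}\hat{\mathcal{O}} = 0$. The formal Hamiltonian in
(24), when expressed in terms of the phase-space variables, is

$$
\begin{aligned}
\hat{\mathcal{H}}^{(4)} = {} & -\left\{n(\mathbf{r}) - \frac{1}{2n_0}\hat{p}_{\perp}^2 - \frac{1}{8n_0^3}\hat{p}_{\perp}^4\right\} \\
& - \frac{1}{8n_0^2}\left\{\left[\hat{p}_{\perp}^2, \left(n(\mathbf{r}) - n_0\right)\right]_{+}
+ 2\left(\hat{p}_x\left(n(\mathbf{r}) - n_0\right)\hat{p}_x + \hat{p}_y\left(n(\mathbf{r}) - n_0\right)\hat{p}_y\right)\right\} \\
& - \frac{i}{8n_0^2}\left\{\left[\hat{p}_x, \left[\hat{p}_y, \left(n(\mathbf{r}) - n_0\right)\right]_{+}\right]
- \left[\hat{p}_y, \left[\hat{p}_x, \left(n(\mathbf{r}) - n_0\right)\right]_{+}\right]\right\} \\
& + \frac{1}{8n_0^3}\left\{\left[\hat{p}_x, \left(n(\mathbf{r}) - n_0\right)\right]_{+}^2
+ \left[\hat{p}_y, \left(n(\mathbf{r}) - n_0\right)\right]_{+}^2\right\} \\
& + \cdots
\end{aligned}
\tag{25}
$$

where $[A, B]_{+} = (AB + BA)$ and '$\cdots$' are the contributions arising from the
presence of the polarization term. Any further simplification would require
information about the refractive index $n(\mathbf{r})$.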

Note that the paraxial Hamiltonian (23) and the leading order aberration
Hamiltonian (25) differ from the ones derived in the traditional approaches.
These differences arise from the presence of the wavelength-dependent
contributions, which occur in two guises. One set occurs totally independent of
the polarization term in the basic Hamiltonian. This set is a multiple of the
unit matrix or at most the matrix $\beta$. The other set involves the
contributions coming from the polarization term in the starting optical
Hamiltonian. This gives rise to both matrix contributions and non-matrix
contributions, as the square of each polarization matrix is unity. We shall
discuss the contributions of the polarization to the beam optics elsewhere.
Here, it suffices to note the existence of the wavelength-dependent
contributions in two distinguishable guises, which are not present in the
traditional prescriptions.



4 When w ≠ 0

In the previous sections we assumed $\mathbf{w} = 0$, and this enabled us to develop
a formalism using 4 × 4 matrices via the Foldy-Wouthuysen machinery. The
Foldy-Wouthuysen transformation enables us to eliminate the odd part in
the 4 × 4 matrices, to any desired order of accuracy. Here too we have the
identical problem, but a step higher in dimensions. So, we need to apply the
Foldy-Wouthuysen transformation to reduce the strength of the odd part in
eight dimensions. This will reduce the problem from eight to four dimensions.



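We start with the grand optical equation in (13) and proceed with the
Foldy-Wouthuysen transformations as before, but with each quantity in
double the number of dimensions. Symbolically this means:

$$
\begin{aligned}
\hat{\mathrm{H}} &\longrightarrow \hat{\mathrm{H}}_g, &
\psi &\longrightarrow \psi_g = \begin{bmatrix} \psi^{+} \\ \psi^{-} \end{bmatrix}, \\
\hat{\mathcal{E}} &\longrightarrow \hat{\mathcal{E}}_g, &
\hat{\mathcal{O}} &\longrightarrow \hat{\mathcal{O}}_g, \\
n_0 &\longrightarrow n_{0g} = n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}.
\end{aligned}
\tag{26}
$$

The first Foldy-Wouthuysen iteration gives

$$
\begin{aligned}
\hat{\mathcal{H}}_g^{(2)} &= -n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g + \hat{\mathcal{E}}_g
- \frac{1}{2n_0}\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g\hat{\mathcal{O}}_g^2 \\
&= -n_0\begin{bmatrix} \beta & 0 \\ 0 & -\beta \end{bmatrix}\beta_g + \hat{\mathcal{E}}_g
+ \frac{1}{2n_0}\lambdabar^2\,\mathbf{w}\cdot\mathbf{w}\begin{bmatrix} \beta & 0 \\ 0 & \beta \end{bmatrix}.
\end{aligned}
\tag{27}
$$

We drop the $\beta_g$ as before and then get the following

$$
\begin{aligned}
i\lambdabar\frac{\partial}{\partial z}\psi(\mathbf{r}) &= \hat{\mathrm{H}}\psi(\mathbf{r}), \\
\hat{\mathrm{H}} &= -n_0\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}, \\
\hat{\mathcal{E}} &= -\left(n(\mathbf{r}) - n_0\right)\beta - i\lambdabar\beta\boldsymbol{\Sigma}\cdot\mathbf{u} + \frac{1}{2n_0}\lambdabar^2 w^2\beta, \\
\hat{\mathcal{O}} &= i\left(M_y\hat{p}_x - M_x\hat{p}_y\right) = \beta\left(\mathbf{M}_{\perp}\cdot\hat{\mathbf{p}}_{\perp}\right),
\end{aligned}
\tag{28}
$$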



where $w^2 = \mathbf{w}\cdot\mathbf{w}$ is the square of the logarithmic
gradient of the resistance function. This is how the basic optical Hamiltonian
(17) gets modified. The next degree of accuracy is achieved by going a step
further in the Foldy-Wouthuysen iteration and obtaining
$\hat{\mathcal{H}}_g^{(4)}$. This would then be the more refined starting
optical Hamiltonian, further modifying the basic optical Hamiltonian (17). In
this way we can apply the Foldy-Wouthuysen transformation in cascade to obtain
the higher order contributions coming from the logarithmic gradient of the
resistance function, to any desired degree of accuracy. We are very unlikely to
need any of these contributions, but it is possible to keep track of them.
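The step from the grand odd operator to the scalar correction proportional to the square of w uses the fact that the square of the grand odd operator in (14) is a multiple of the 8 × 8 unit matrix. The following sketch checks this, together with the anticommutation property (16), for one sample real vector w. The helper names and numerical values are assumptions made for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(a, b, c, d):
    """Assemble [[a, b], [c, d]] from four equally sized square blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def zeros(n):
    return [[0j] * n for _ in range(n)]

def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def conj(a):
    return [[x.conjugate() for x in row] for row in a]

def dot(mats, vec):
    """Return sum_j vec[j] * mats[j]."""
    out = zeros(len(mats[0]))
    for m, c in zip(mats, vec):
        out = [[x + c * y for x, y in zip(ro, rm)] for ro, rm in zip(out, m)]
    return out

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

PAULI = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]
Z2, Z4 = zeros(2), zeros(4)
BIG_SIGMA = [block(s, Z2, Z2, s) for s in PAULI]     # Sigma of equation (5)
ALPHA_Y = block(Z2, PAULI[1], PAULI[1], Z2)          # y-component of alpha

lam = 0.1                 # illustrative reduced wavelength
w = (0.3, -0.4, 1.2)      # illustrative real vector w
w_squared = sum(c * c for c in w)

upper = scale(-lam, matmul(dot(BIG_SIGMA, w), ALPHA_Y))
lower = scale(-lam, matmul(dot([conj(s) for s in BIG_SIGMA], w), ALPHA_Y))
O_g = block(Z4, upper, lower, Z4)
beta_g = block(identity(4), Z4, Z4, scale(-1, identity(4)))

O_g_squared = matmul(O_g, O_g)
target = scale(-lam**2 * w_squared, identity(8))
anticommutator = [[x + y for x, y in zip(r1, r2)]
                  for r1, r2 in zip(matmul(beta_g, O_g), matmul(O_g, beta_g))]
print(max_abs_diff(O_g_squared, target), max_abs_diff(anticommutator, zeros(8)))
```

Since the square of the grand odd operator is minus the squared reduced wavelength times the square of w times the unit matrix, the first iteration can only add a term proportional to the square of w, in line with (28).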



5 Concluding Remarks 

We start with the Maxwell equations and express them in a matrix form in
a medium with varying permittivity and permeability in the presence of sources,
using 8 × 8 matrices. From this exact matrix representation we construct the
exact optical Hamiltonian for a monochromatic quasiparaxial light beam.
The optical Hamiltonian has a very close algebraic similarity with the Dirac
equation. We exploit this similarity to adopt the standard machinery, namely
the Foldy-Wouthuysen transformation technique of the Dirac theory. This
enables us to obtain the beam-optical Hamiltonian to any desired degree of
accuracy. We further get the wavelength-dependent contributions at each
order, starting with the lowest-order paraxial Hamiltonian.

The beam-optical Hamiltonians also have the wavelength-dependent
matrix terms which are associated with the polarization. In this approach we
have been able to derive a Hamiltonian which contains both the beam-optics
and the polarization. In Part-III [13] we shall apply the formalism to the
specific examples and see how the beam-optics (paraxial behaviour and the
aberrations) gets modified by the wavelength-dependent contributions. In
Part-IV [14] we shall examine the polarization component of the formalism
presented here.

Appendix-A
Foldy-Wouthuysen Transformation 

In the traditional scheme the purpose of expanding the light optics
Hamiltonian $\hat{H} = -\left(n^2(\mathbf{r}) - \hat{p}_{\perp}^2\right)^{1/2}$
in a series using $\left(\frac{1}{n_0^2}\hat{p}_{\perp}^2\right)$ as the expansion
parameter is to understand the propagation of the quasiparaxial beam in
terms of a series of approximations (paraxial + nonparaxial). The situation is
similar in the case of charged-particle optics. Let us recall that in
relativistic quantum mechanics too one has a similar problem of understanding
the relativistic wave equations as the nonrelativistic approximation plus the
relativistic correction terms in the quasirelativistic regime. For the Dirac
equation (which is first order in time) this is done most conveniently using
the Foldy-Wouthuysen transformation, leading to an iterative diagonalization
technique.

The main framework of the formalism of optics used here (and in
charged-particle optics) is based on the transformation technique of the
Foldy-Wouthuysen theory, which casts the Dirac equation in a form
displaying the different interaction terms between the Dirac particle and an
applied electromagnetic field in a nonrelativistic and easily interpretable form
(see [21]-[23] for a general discussion of the role of the
Foldy-Wouthuysen-type transformations in the particle interpretation of
relativistic wave equations). In the Foldy-Wouthuysen theory the Dirac equation
is decoupled through a canonical transformation into two two-component
equations: one reduces to the Pauli equation in the nonrelativistic limit and
the other describes the negative-energy states.

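Let us describe here briefly the standard Foldy-Wouthuysen theory so that
the way it has been adopted for the purposes of the above studies in optics
will be clear. Let us consider a charged particle of rest mass $m_0$ and charge
$q$ in the presence of an electromagnetic field characterized by
$\mathbf{E} = -\nabla\phi - \frac{\partial}{\partial t}\mathbf{A}$
and $\mathbf{B} = \nabla\times\mathbf{A}$. Then the Dirac equation is

$$
i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \hat{H}_D\Psi(\mathbf{r},t),
\tag{A.1}
$$

$$
\begin{aligned}
\hat{H}_D &= m_0c^2\beta + q\phi + c\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}} \\
&= m_0c^2\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}, \\
\hat{\mathcal{E}} &= q\phi, \\
\hat{\mathcal{O}} &= c\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}},
\end{aligned}
\tag{A.2}
$$

where

$$
\begin{aligned}
\boldsymbol{\alpha} &= \begin{bmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{bmatrix}, \qquad
\beta = \begin{bmatrix} \mathbf{1} & 0 \\ 0 & -\mathbf{1} \end{bmatrix}, \qquad
\mathbf{1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \\
\boldsymbol{\sigma} &= \left[
\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\;
\sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix},\;
\sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\right],
\end{aligned}
\tag{A.3}
$$

with $\hat{\boldsymbol{\pi}} = \hat{\mathbf{p}} - q\mathbf{A}$,
$\hat{\mathbf{p}} = -i\hbar\nabla$, and
$\hat{\pi}^2 = \left(\hat{\pi}_x^2 + \hat{\pi}_y^2 + \hat{\pi}_z^2\right)$.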

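In the nonrelativistic situation the upper pair of components of the Dirac
spinor $\Psi$ are large compared to the lower pair of components. The operator
$\hat{\mathcal{E}}$, which does not couple the large and small components of
$\Psi$, is called 'even', and $\hat{\mathcal{O}}$ is called an 'odd' operator,
which couples the large to the small components. Note that

$$
\beta\hat{\mathcal{O}} = -\hat{\mathcal{O}}\beta, \qquad
\beta\hat{\mathcal{E}} = \hat{\mathcal{E}}\beta.
\tag{A.4}
$$

Now, the search is for a unitary transformation,
$\Psi \longrightarrow \Psi' = \hat{U}\Psi$, such that
the equation for $\Psi'$ does not contain any odd operator.

In the free particle case (with $\phi = 0$ and $\hat{\boldsymbol{\pi}} = \hat{\mathbf{p}}$)
such a Foldy-Wouthuysen transformation is given by

$$
\hat{U}_F = e^{i\hat{S}} = e^{\beta\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\,\theta}, \qquad
\tan 2|\hat{\mathbf{p}}|\theta = \frac{|\hat{\mathbf{p}}|}{m_0c}.
\tag{A.5}
$$

This transformation eliminates the odd part completely from the free particle
Dirac Hamiltonian, reducing it to the diagonal form:

$$
\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi'
&= e^{i\hat{S}}\left(m_0c^2\beta + c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\right)e^{-i\hat{S}}\,\Psi' \\
&= \left(\cos|\hat{\mathbf{p}}|\theta + \frac{\beta\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}}{|\hat{\mathbf{p}}|}\sin|\hat{\mathbf{p}}|\theta\right)
\left(m_0c^2\beta + c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\right)
\left(\cos|\hat{\mathbf{p}}|\theta - \frac{\beta\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}}{|\hat{\mathbf{p}}|}\sin|\hat{\mathbf{p}}|\theta\right)\Psi' \\
&= \left(m_0c^2\cos 2|\hat{\mathbf{p}}|\theta + c|\hat{\mathbf{p}}|\sin 2|\hat{\mathbf{p}}|\theta\right)\beta\,\Psi' \\
&= \left(\sqrt{m_0^2c^4 + c^2\hat{p}^2}\right)\beta\,\Psi'.
\end{aligned}
\tag{A.6}
$$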
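The free-particle diagonalization can be verified numerically for a single momentum value. The sketch below is an illustration under stated assumptions: the momentum operator is replaced by an ordinary vector, units are chosen so that the rest mass and the speed of light are both one, and the helper names are ours.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble [[a, b], [c, d]] from four equally sized square blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

PAULI = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]
Z2 = [[0j, 0j], [0j, 0j]]
I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
I4 = block(I2, Z2, Z2, I2)
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, scale(-1, I2))

m0, c = 1.0, 1.0              # units with m0 = c = 1 (assumption of this sketch)
p = (0.3, -0.2, 0.5)          # sample momentum vector
p_abs = math.sqrt(sum(x * x for x in p))

alpha_dot_p = scale(0, I4)
for component, alpha_j in zip(p, ALPHA):
    alpha_dot_p = add(alpha_dot_p, scale(component, alpha_j))

H_free = add(scale(m0 * c**2, BETA), scale(c, alpha_dot_p))

# |p| * theta, fixed by tan(2 |p| theta) = |p| / (m0 c) as in (A.5).
phi = 0.5 * math.atan2(p_abs, m0 * c)
K = scale(1.0 / p_abs, matmul(BETA, alpha_dot_p))      # squares to minus unity
U = add(scale(math.cos(phi), I4), scale(math.sin(phi), K))
U_inv = add(scale(math.cos(phi), I4), scale(-math.sin(phi), K))

H_transformed = matmul(matmul(U, H_free), U_inv)
energy = math.sqrt(m0**2 * c**4 + c**2 * p_abs**2)
target = scale(energy, BETA)
print(max_abs_diff(H_transformed, target))
```

The transformed matrix has no entries connecting the upper and lower pairs of components, and its diagonal carries plus and minus the relativistic energy, as stated in (A.6).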



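In the general case, when the electron is in a time-dependent
electromagnetic field, it is not possible to construct an $\exp(i\hat{S})$ which
removes the odd operators from the transformed Hamiltonian completely.
Therefore, one has to be content with a nonrelativistic expansion of the
transformed Hamiltonian in a power series in $1/m_0c^2$, keeping terms through
any desired order. Note that in the nonrelativistic case, when
$|\mathbf{p}| \ll m_0c$, the transformation operator is
$\hat{U}_F = \exp(i\hat{S})$ with
$\hat{S} \approx -i\beta\hat{\mathcal{O}}/2m_0c^2$, where
$\hat{\mathcal{O}} = c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}$ is the odd
part of the free Hamiltonian. So, in the general case we can start with the
transformation

$$
\Psi^{(1)} = e^{i\hat{S}_1}\Psi, \qquad
\hat{S}_1 = -\frac{i\beta\hat{\mathcal{O}}}{2m_0c^2} = -\frac{i\beta\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}}}{2m_0c}.
\tag{A.7}
$$

Then, the equation for $\Psi^{(1)}$ is

$$
\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi^{(1)}
&= i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\Psi\right)
= i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\right)\Psi + e^{i\hat{S}_1}\left(i\hbar\frac{\partial}{\partial t}\Psi\right) \\
&= \left[i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\right) + e^{i\hat{S}_1}\hat{H}_D\right]\Psi \\
&= \left[i\hbar\frac{\partial}{\partial t}\left(e^{i\hat{S}_1}\right)e^{-i\hat{S}_1} + e^{i\hat{S}_1}\hat{H}_De^{-i\hat{S}_1}\right]\Psi^{(1)} \\
&= \left[e^{i\hat{S}_1}\hat{H}_De^{-i\hat{S}_1} - i\hbar\,e^{i\hat{S}_1}\frac{\partial}{\partial t}\left(e^{-i\hat{S}_1}\right)\right]\Psi^{(1)} \\
&= \hat{H}_D^{(1)}\Psi^{(1)},
\end{aligned}
\tag{A.8}
$$

where we have used the identity
$\frac{\partial}{\partial t}\left(e^{\hat{A}}\right)e^{-\hat{A}} + e^{\hat{A}}\frac{\partial}{\partial t}\left(e^{-\hat{A}}\right) = \frac{\partial}{\partial t}\hat{I} = 0$.
Now, using the identities

$$
\begin{aligned}
e^{\hat{A}}\hat{B}e^{-\hat{A}} &= \hat{B} + [\hat{A},\hat{B}] + \frac{1}{2!}[\hat{A},[\hat{A},\hat{B}]] + \frac{1}{3!}[\hat{A},[\hat{A},[\hat{A},\hat{B}]]] + \cdots \\
e^{\hat{A}(t)}\frac{\partial}{\partial t}\left(e^{-\hat{A}(t)}\right)
&= \left(1 + \hat{A}(t) + \frac{1}{2!}\hat{A}(t)^2 + \frac{1}{3!}\hat{A}(t)^3 + \cdots\right)
\frac{\partial}{\partial t}\left(1 - \hat{A}(t) + \frac{1}{2!}\hat{A}(t)^2 - \frac{1}{3!}\hat{A}(t)^3 + \cdots\right) \\
&\approx -\frac{\partial\hat{A}(t)}{\partial t}
- \frac{1}{2!}\left[\hat{A}(t), \frac{\partial\hat{A}(t)}{\partial t}\right]
- \frac{1}{3!}\left[\hat{A}(t), \left[\hat{A}(t), \frac{\partial\hat{A}(t)}{\partial t}\right]\right] \\
&\quad - \frac{1}{4!}\left[\hat{A}(t), \left[\hat{A}(t), \left[\hat{A}(t), \frac{\partial\hat{A}(t)}{\partial t}\right]\right]\right],
\end{aligned}
\tag{A.9}
$$

with $\hat{A} = i\hat{S}_1$, we find

$$
\begin{aligned}
\hat{H}_D^{(1)} \approx {} & \hat{H}_D - \hbar\frac{\partial\hat{S}_1}{\partial t}
+ i\left[\hat{S}_1, \hat{H}_D - \frac{\hbar}{2}\frac{\partial\hat{S}_1}{\partial t}\right]
- \frac{1}{2!}\left[\hat{S}_1, \left[\hat{S}_1, \hat{H}_D - \frac{\hbar}{3}\frac{\partial\hat{S}_1}{\partial t}\right]\right] \\
& - \frac{i}{3!}\left[\hat{S}_1, \left[\hat{S}_1, \left[\hat{S}_1, \hat{H}_D - \frac{\hbar}{4}\frac{\partial\hat{S}_1}{\partial t}\right]\right]\right].
\end{aligned}
\tag{A.10}
$$

Substituting in (A.10) $\hat{H}_D = m_0c^2\beta + \hat{\mathcal{E}} + \hat{\mathcal{O}}$,
simplifying the right hand side using the relations
$\beta\hat{\mathcal{O}} = -\hat{\mathcal{O}}\beta$ and
$\beta\hat{\mathcal{E}} = \hat{\mathcal{E}}\beta$, and collecting everything
together, we have

$$
\begin{aligned}
\hat{H}_D^{(1)} &\approx m_0c^2\beta + \hat{\mathcal{E}}_1 + \hat{\mathcal{O}}_1, \\
\hat{\mathcal{E}}_1 &\approx \hat{\mathcal{E}} + \frac{1}{2m_0c^2}\beta\hat{\mathcal{O}}^2
- \frac{1}{8m_0^2c^4}\left[\hat{\mathcal{O}}, \left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)\right]
- \frac{1}{8m_0^3c^6}\beta\hat{\mathcal{O}}^4, \\
\hat{\mathcal{O}}_1 &\approx \frac{\beta}{2m_0c^2}\left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)
- \frac{1}{3m_0^2c^4}\hat{\mathcal{O}}^3,
\end{aligned}
\tag{A.11}
$$

with $\hat{\mathcal{E}}_1$ and $\hat{\mathcal{O}}_1$ obeying the relations
$\beta\hat{\mathcal{O}}_1 = -\hat{\mathcal{O}}_1\beta$ and
$\beta\hat{\mathcal{E}}_1 = \hat{\mathcal{E}}_1\beta$ exactly
like $\hat{\mathcal{E}}$ and $\hat{\mathcal{O}}$. It is seen that while the term
$\hat{\mathcal{O}}$ in $\hat{H}_D$ is of order zero with respect to the
expansion parameter $1/m_0c^2$ (i.e., $\hat{\mathcal{O}} = O\left((1/m_0c^2)^0\right)$),
the odd part of $\hat{H}_D^{(1)}$, namely $\hat{\mathcal{O}}_1$, contains only
terms of order $1/m_0c^2$ and higher powers of $1/m_0c^2$
(i.e., $\hat{\mathcal{O}}_1 = O\left((1/m_0c^2)\right)$).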
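How a single Foldy-Wouthuysen step lowers the strength of the odd part can be seen in a two-component toy model. The sketch below is only an illustration and rests on several assumptions of ours: the role of beta is played by the Pauli matrix sigma_z, the odd operator is a small constant eps times sigma_x, there is no even part, and the optical generator is obtained from (A.7) by the replacement of the rest energy with minus n0, as suggested by the analogy with the Dirac case. With these choices exp(iS) is an ordinary rotation matrix.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

n0 = 1.5     # leading (constant index) term, illustrative value
eps = 0.1    # strength of the odd term, small compared with n0

# H = -n0 * sigma_z + eps * sigma_x  (toy analogue of -n0*beta + O)
H = [[-n0, eps], [eps, n0]]

# With S = i * beta * O / (2 n0) one gets exp(iS) = exp(-i a sigma_y), a = eps/(2 n0).
a = eps / (2.0 * n0)
U = [[math.cos(a), -math.sin(a)], [math.sin(a), math.cos(a)]]
U_inv = [[math.cos(a), math.sin(a)], [-math.sin(a), math.cos(a)]]

H1 = matmul2(matmul2(U, H), U_inv)
even_part = H1[0][0]    # coefficient multiplying sigma_z
odd_part = H1[0][1]     # coefficient multiplying sigma_x

predicted_even = -n0 - eps**2 / (2.0 * n0)    # -n0*beta - (1/2n0)*beta*O^2
predicted_odd = -eps**3 / (3.0 * n0**2)       # leading surviving odd term
print(f"odd part before: {eps:.3e}, after: {odd_part:.3e}")
```

In this toy model the odd coefficient drops from first order to third order in eps after one step, while the even part picks up the expected second-order correction; this mirrors the structure of (A.11) with the time derivative and the even operator absent.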

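To reduce the strength of the odd terms further in the transformed
Hamiltonian, a second Foldy-Wouthuysen transformation is applied with the same
prescription:

$$
\begin{aligned}
\Psi^{(2)} &= e^{i\hat{S}_2}\Psi^{(1)}, \\
\hat{S}_2 &= -\frac{i\beta\hat{\mathcal{O}}_1}{2m_0c^2}
= -\frac{i\beta}{2m_0c^2}\left[\frac{\beta}{2m_0c^2}\left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right) - \frac{1}{3m_0^2c^4}\hat{\mathcal{O}}^3\right].
\end{aligned}
\tag{A.12}
$$

After this transformation,

$$
\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi^{(2)} &= \hat{H}_D^{(2)}\Psi^{(2)}, \qquad
\hat{H}_D^{(2)} = m_0c^2\beta + \hat{\mathcal{E}}_2 + \hat{\mathcal{O}}_2, \\
\hat{\mathcal{E}}_2 &\approx \hat{\mathcal{E}}_1, \qquad
\hat{\mathcal{O}}_2 \approx \frac{\beta}{2m_0c^2}\left(\left[\hat{\mathcal{O}}_1, \hat{\mathcal{E}}_1\right] + i\hbar\frac{\partial\hat{\mathcal{O}}_1}{\partial t}\right),
\end{aligned}
\tag{A.13}
$$

where, now, $\hat{\mathcal{O}}_2 = O\left((1/m_0c^2)^2\right)$. After the third
transformation

$$
\Psi^{(3)} = e^{i\hat{S}_3}\Psi^{(2)}, \qquad
\hat{S}_3 = -\frac{i\beta\hat{\mathcal{O}}_2}{2m_0c^2},
\tag{A.14}
$$

we have

$$
\begin{aligned}
i\hbar\frac{\partial}{\partial t}\Psi^{(3)} &= \hat{H}_D^{(3)}\Psi^{(3)}, \qquad
\hat{H}_D^{(3)} = m_0c^2\beta + \hat{\mathcal{E}}_3 + \hat{\mathcal{O}}_3, \\
\hat{\mathcal{E}}_3 &\approx \hat{\mathcal{E}}_2 \approx \hat{\mathcal{E}}_1, \qquad
\hat{\mathcal{O}}_3 \approx \frac{\beta}{2m_0c^2}\left(\left[\hat{\mathcal{O}}_2, \hat{\mathcal{E}}_2\right] + i\hbar\frac{\partial\hat{\mathcal{O}}_2}{\partial t}\right),
\end{aligned}
\tag{A.15}
$$

where $\hat{\mathcal{O}}_3 = O\left((1/m_0c^2)^3\right)$. So, neglecting
$\hat{\mathcal{O}}_3$,

$$
\begin{aligned}
\hat{H}_D^{(3)} \approx {} & m_0c^2\beta + \hat{\mathcal{E}} + \frac{1}{2m_0c^2}\beta\hat{\mathcal{O}}^2
- \frac{1}{8m_0^2c^4}\left[\hat{\mathcal{O}}, \left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)\right] \\
& - \frac{1}{8m_0^3c^6}\beta\left\{\hat{\mathcal{O}}^4 + \left(\left[\hat{\mathcal{O}}, \hat{\mathcal{E}}\right] + i\hbar\frac{\partial\hat{\mathcal{O}}}{\partial t}\right)^2\right\}.
\end{aligned}
\tag{A.16}
$$

It may be noted that, starting with the second transformation, successive
$(\hat{\mathcal{E}}, \hat{\mathcal{O}})$ pairs can be obtained recursively using the rule

$$
\begin{aligned}
\hat{\mathcal{E}}_j &= \hat{\mathcal{E}}_1\left(\hat{\mathcal{E}} \to \hat{\mathcal{E}}_{j-1}, \hat{\mathcal{O}} \to \hat{\mathcal{O}}_{j-1}\right), \\
\hat{\mathcal{O}}_j &= \hat{\mathcal{O}}_1\left(\hat{\mathcal{E}} \to \hat{\mathcal{E}}_{j-1}, \hat{\mathcal{O}} \to \hat{\mathcal{O}}_{j-1}\right), \qquad j > 1,
\end{aligned}
\tag{A.17}
$$

and retaining only the relevant terms of desired order at each step.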

With $\hat{\mathcal{E}} = q\phi$ and
$\hat{\mathcal{O}} = c\boldsymbol{\alpha}\cdot\hat{\boldsymbol{\pi}}$, the final
reduced Hamiltonian (A.16) is, to the order calculated,

$$
\begin{aligned}
\hat{H}_D^{(3)} = {} & \beta\left(m_0c^2 + \frac{\hat{\pi}^2}{2m_0} - \frac{\hat{p}^4}{8m_0^3c^2}\right) + q\phi
- \frac{q\hbar}{2m_0c}\beta\boldsymbol{\Sigma}\cdot\mathbf{B} \\
& - \frac{iq\hbar^2}{8m_0^2c^2}\boldsymbol{\Sigma}\cdot\mathrm{curl}\,\mathbf{E}
- \frac{q\hbar}{4m_0^2c^2}\boldsymbol{\Sigma}\cdot\mathbf{E}\times\hat{\mathbf{p}}
- \frac{q\hbar^2}{8m_0^2c^2}\mathrm{div}\,\mathbf{E},
\end{aligned}
\tag{A.18}
$$

with the individual terms having direct physical interpretations. The terms
in the first parenthesis result from the expansion of
$\sqrt{m_0^2c^4 + c^2\hat{\pi}^2}$, showing the effect of the relativistic mass
increase. The second and third terms are the electrostatic and magnetic dipole
energies. The next two terms, taken together (for hermiticity), contain the
spin-orbit interaction. The last term, the so-called Darwin term, is attributed
to the zitterbewegung (trembling motion) of the Dirac particle: because of the
rapid coordinate fluctuations over distances of the order of the Compton
wavelength ($2\pi\hbar/m_0c$) the particle sees a somewhat smeared out electric
potential.

It is clear that the Foldy-Wouthuysen transformation technique expands
the Dirac Hamiltonian as a power series in the parameter $1/m_0c^2$, enabling the
use of a systematic approximation procedure for studying the deviations from
the nonrelativistic situation. We note the analogy between the nonrelativistic
particle dynamics and paraxial optics:

The Analogy

Standard Dirac Equation                 Beam Optical Form
--------------------------------------  --------------------------------------
m_0 c^2 β + E_D + O_D                   -n_0 β + E + O
m_0 c^2                                 -n_0
Positive Energy                         Forward Propagation
Nonrelativistic, |π| ≪ m_0 c            Paraxial Beam, |p_⊥| ≪ n_0
Nonrelativistic Motion                  Paraxial Behavior
  + Relativistic Corrections              + Aberration Corrections

Noting the above analogy, the idea of the Foldy-Wouthuysen form of the Dirac
theory has been adopted to study the paraxial optics and deviations from it
by first casting the Maxwell equations in a spinor form resembling exactly the
Dirac equation (A.1, A.2) in all respects: i.e., a multicomponent $\Psi$ having
the upper half of its components large compared to the lower components,
and the Hamiltonian having an even part ($\hat{\mathcal{E}}$), an odd part
($\hat{\mathcal{O}}$), a suitable expansion parameter
($|\hat{\mathbf{p}}_{\perp}|/n_0 \ll 1$) characterizing the dominant forward
propagation, and a leading term with a $\beta$ coefficient commuting with
$\hat{\mathcal{E}}$ and anticommuting with $\hat{\mathcal{O}}$. The additional
feature of our formalism is to return finally to the original representation
after making an extra approximation, dropping $\beta$ from the final reduced
optical Hamiltonian, taking into account the fact that we are primarily
interested only in the forward-propagating beam.